The high-availability improvements in Exchange 2010
make it even easier to deploy cross-site failover solutions without a
need for third-party network and storage solutions. The secondary site
can be used to handle primary site outages resulting from maintenance or
other, more serious failures. Even with the improvements in Exchange
2010, a multi-site deployment requires careful planning to deploy and
maintain successfully.
1. Cross-site DAG Considerations
The primary building block of a
cross-site solution is the cross-site DAG. Extending a DAG between
sites has several requirements, including the following:
Fewer than 250 milliseconds of latency between all DAG members. To ensure consistent DAG operations, there should be minimal latency.
At least one domain controller in each site. Exchange requires a domain controller in each site where it is deployed; for redundancy, at least two should be deployed.
At least one Client Access server in each site. To provide client connectivity to both sites, at least one Client Access server must be deployed; for redundancy, at least two should be deployed.
At least one Hub Transport server in each site. To provide e-mail transport to both sites, at least one Hub Transport server must be deployed; for redundancy, at least two should be deployed.
Consider the impact of a failover on supporting services. The appropriate number and configuration of Client Access, Hub Transport, Edge Transport, and Unified Messaging servers, along with domain controllers, must be located at each site to support the maximum number of active mailboxes.
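The requirements above can be turned into a simple pre-deployment check. The following is a minimal Python sketch, not an Exchange tool: the `SiteInventory` type, the function names, and the way latency measurements are supplied are all assumptions made for illustration. Only the thresholds (fewer than 250 milliseconds, at least one server per role, two for redundancy) come from the list above.

```python
from dataclasses import dataclass

MAX_LATENCY_MS = 250  # limit stated in the requirements list above


@dataclass
class SiteInventory:
    """Hypothetical per-site server counts, gathered by hand or by script."""
    name: str
    domain_controllers: int
    client_access_servers: int
    hub_transport_servers: int


def check_site(site):
    """Return (errors, warnings) for one site's supporting roles."""
    errors, warnings = [], []
    roles = {
        "domain controller": site.domain_controllers,
        "Client Access server": site.client_access_servers,
        "Hub Transport server": site.hub_transport_servers,
    }
    for role, count in roles.items():
        if count < 1:
            # Hard requirement: at least one per site.
            errors.append(f"{site.name}: no {role} deployed")
        elif count < 2:
            # Recommendation: two or more for redundancy.
            warnings.append(f"{site.name}: only one {role}; no redundancy")
    return errors, warnings


def check_latency(latency_ms):
    """latency_ms maps (member_a, member_b) to a measured latency in ms."""
    return [
        f"{a} <-> {b}: {ms} ms is not under {MAX_LATENCY_MS} ms"
        for (a, b), ms in sorted(latency_ms.items())
        if ms >= MAX_LATENCY_MS
    ]
```

For example, `check_site(SiteInventory("DR", 1, 2, 0))` reports a missing Hub Transport server as an error and the single domain controller as a warning. Sizing each site for the maximum number of active mailboxes still needs a separate capacity exercise that this sketch does not attempt.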
In the case of a complete datacenter failure:
Quorum must be reestablished. To mount databases, a quorum must be established within the cluster. If a majority of the members, including the file share witness, are unavailable, the DAG must be manually reconfigured to reestablish quorum.
Manual switchover process. To bring up the second site, the administrator must manually initiate the switchover. A complete datacenter switchover is not something to consider lightly from a business process standpoint. Manual intervention is required so that an administrator makes the decision to initiate a full datacenter switchover.
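To see why a datacenter failure usually leaves the surviving site without quorum, it can help to count votes. The sketch below is a simplified Python model, not Exchange or cluster code, and its function names are hypothetical. It assumes the usual DAG behavior in which the file share witness contributes a vote only when the DAG has an even number of members, and that quorum needs more than half of all votes.

```python
def total_votes(dag_members):
    """Voters in the cluster: every member, plus the file share
    witness when the member count is even (assumed model)."""
    return dag_members + 1 if dag_members % 2 == 0 else dag_members


def votes_needed(dag_members):
    """Quorum requires a strict majority of all voters."""
    return total_votes(dag_members) // 2 + 1


def has_quorum(dag_members, members_online, witness_online):
    """True if the reachable voters still form a majority."""
    votes = members_online
    if dag_members % 2 == 0 and witness_online:
        votes += 1  # the witness only counts in even-member DAGs
    return votes >= votes_needed(dag_members)


# Four-member DAG, two members per site, witness in the primary site.
# Losing the primary site leaves two members and no witness online.
print(has_quorum(4, members_online=2, witness_online=False))  # False
```

In this model a four-member DAG needs three votes, so the two surviving members in the second site cannot mount databases until an administrator reconfigures the DAG, which is the manual step described above.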
2. Cross-site Considerations for Client Access and Transport
When you deploy non-Mailbox servers to support a cross-site
failover, you might come across several issues. Domain Name System (DNS)
entries for Outlook Web App, Outlook Anywhere, and Autodiscover must be
changed, and inbound e-mail (MX) records must be redirected, to reflect
the secondary site's IP addresses. These record changes should be
automated to provide the quickest return to service, because clients
that connect to these services will fail until they have the new
addresses. The changes can take effect more quickly if you deploy DNS
servers in multiple locations or use third-party global server load
balancing. If you are using a hosted anti-spam or archiving service,
that service must also be redirected to the new site.
Proper namespace planning is
needed for the failover process to run smoothly. To do this, you must
consider each datacenter as being active and choose a unique set of
names for each Exchange service. This includes Outlook Web App (OWA),
Post Office Protocol version 3 (POP3), Internet Message Access Protocol
version 4 (IMAP4), Exchange Web Services, and Outlook Anywhere; however,
it cannot include Autodiscover. Having this number of names requires
that you configure certificates to reflect the names that each site
uses. Either ensure that the certificates contain all required host
names for services in both datacenters, or use a wildcard certificate.
If you choose to use separate certificates for each datacenter, you must
ensure that each certificate has the same certificate principal name. To
reduce the impact on Outlook connections, you must run Set-OutlookProvider EXPR -CertPrincipalName msstd:<certificate principal name>.
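The namespace and certificate planning described above can be sketched as a small worksheet. The Python below is illustrative only: the host names under example.com, the dictionary layout, and the function names are assumptions, not names that Exchange prescribes. It collects the host names a certificate covering both datacenters would need, checks that separate certificates share one principal name, and builds the Set-OutlookProvider command text quoted above.

```python
def required_host_names(namespaces, autodiscover_name):
    """Union of every service name in every datacenter, plus the
    Autodiscover name, sorted for easy review."""
    names = {autodiscover_name}
    for services in namespaces.values():
        names.update(services.values())
    return sorted(names)


def same_principal_name(principal_names):
    """principal_names maps datacenter -> certificate principal name."""
    return len({n.lower() for n in principal_names.values()}) == 1


def outlook_provider_command(principal_name):
    """Text of the cmdlet from the paragraph above, with the name filled in."""
    return f"Set-OutlookProvider EXPR -CertPrincipalName msstd:{principal_name}"


# Hypothetical namespace plan: one set of names per datacenter.
plan = {
    "primary": {"OWA": "mail.example.com", "POP3": "pop.example.com"},
    "secondary": {"OWA": "mail-dr.example.com", "POP3": "pop-dr.example.com"},
}
for name in required_host_names(plan, "autodiscover.example.com"):
    print(name)
```

The resulting list is what a single certificate would have to carry for both datacenters; a wildcard certificate avoids maintaining it by hand.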
Gary A. Cooper
Senior Systems Architect, Horizons Consulting, Inc., United States
In previous versions of
Exchange Server, when thinking of high availability and site resiliency,
we often thought only of how to protect the mailbox database and how to
make it available in another datacenter in the event that something
happened to the primary copy. Although database availability and the
DAG are still important factors in Exchange Server 2010, it is now
equally important to consider the Client Access server role, the
overall namespace design, and their impact on your high availability and
site resiliency plan.
To account for the impact
the namespace design has on availability, it is helpful to think about
the different switchover/failover (*over) scenarios and the impact those
*over scenarios have on all of the client connectivity types that your
organization needs to support. After the namespace design has been drawn
out, I recommend deploying the design in a lab environment so that the
*over scenarios can be played out and the client types supported by the
organization can be fully tested to gauge the impact on users. It is
important to note whether the client continues to run without
interruption or experiences a brief disconnect and then
automatically reconnects. The client might also reconnect only
after a timeout value has been exceeded (for example, when the DNS
resolver cache expires). During the testing phase, you can also work out
any intervention steps you must take to ensure a smoother transition
during a failure.
After you have fully tested
the client impact, it is important to document the results, both for your
design documentation and so that you can articulate them to
your senior management and to the user community at large. In this way,
you can set everyone's expectations properly and avoid confusion in the
event that the unthinkable disaster happens.
To visualize the different
scenarios, it is often helpful to build a chart that allows you to track
the success or failure of each client connection type given specific
*over scenarios.
The HIGH AVAILABILITY columns cover a single-site, single-namespace design; the SITE RESILIENCY columns cover a two-site, two-namespace design.

| CLIENT TYPE | HIGH AVAILABILITY: SWITCHOVER | HIGH AVAILABILITY: FAILOVER | SITE RESILIENCY: SWITCHOVER | SITE RESILIENCY: FAILOVER |
|---|---|---|---|---|
| OWA | No user impact (Success) | No user impact (Success) | No user impact (Success) | No user impact (Success) |
| Exchange ActiveSync 5/6 | No user impact (Success) | No user impact (Success) | Client failure and profile must be manually updated (Failure) | No user impact (Success) |
| Exchange ActiveSync 6.1+ | No user impact (Success) | No user impact (Success) | No user impact (Success) | No user impact (Success) |
| Outlook 2007/2010 (Outlook Anywhere) | Short client disconnect and reconnect (Success) | Short client disconnect and reconnect (Success) | No user impact (Success) | No user impact (If EXPR matches certificate CN) (Success) |
| Outlook 2007/2010 (Internal RPC) | Short client disconnect and reconnect (Success) | Short client disconnect and reconnect (Success) | No user impact (Success) | No user impact (If EXPR matches certificate CN) (Success) |
| POP3/IMAP4 | No user impact (Success) | No user impact (Success) | Client failure and profile must be manually updated (Failure) | No user impact (Success) |
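If you would rather keep such a chart in a script than in a document during lab testing, a small structure like the following might do. It is a Python sketch under stated assumptions: the scenario labels, the function names, and the dictionary layout are hypothetical conveniences, not part of any Exchange tooling.

```python
# Column labels for the chart; "HA" is the single-site design and
# "SR" the two-site design (labels chosen for this sketch).
SCENARIOS = ("HA switchover", "HA failover", "SR switchover", "SR failover")


def record(chart, client, scenario, succeeded, note=""):
    """Store one lab result for a client type and *over scenario."""
    if scenario not in SCENARIOS:
        raise ValueError(f"unknown scenario: {scenario}")
    chart.setdefault(client, {})[scenario] = (succeeded, note)


def failed_cases(chart):
    """Client/scenario pairs that need an intervention step."""
    return sorted(
        (client, scenario)
        for client, row in chart.items()
        for scenario, (ok, _note) in row.items()
        if not ok
    )


def untested(chart):
    """Client/scenario pairs with no recorded result yet."""
    return sorted(
        (client, scenario)
        for client, row in chart.items()
        for scenario in SCENARIOS
        if scenario not in row
    )


chart = {}
record(chart, "POP3/IMAP4", "SR switchover", False,
       "profile must be manually updated")
record(chart, "OWA", "SR switchover", True)
print(failed_cases(chart))
```

Listing the untested combinations alongside the failures makes it harder to sign off on a design before every client type has been exercised in every scenario.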